274 research outputs found

    Normalization of array-CGH data: influence of copy number imbalances

    Background: High-resolution microarray-based comparative genomic hybridization (CGH) techniques have successfully been applied to study copy number imbalances in a number of settings such as the analysis of cancer genomes. For normalization of array-CGH data, methods initially developed for gene expression microarray analysis have, in general, been directly adopted and used. However, these methods are designed to work under assumptions that may not be valid for array-CGH data when copy number imbalances are present. We therefore sought to investigate the effect on normalization imposed by copy number imbalances. Results: Here we demonstrate that copy number imbalances correlate with intensity in array-CGH data, thereby causing problems for conventional normalization methods. We propose a strategy to circumvent these problems by taking copy number imbalances into account during normalization, and we test the proposed strategy using several data sets from the analysis of cancer genomes. In addition, we show how the strategy can be applied to conveniently define adaptive sample-specific boundaries between balanced copy number, losses, and gains to facilitate management of variation in tissue heterogeneity when calling copy number changes. Conclusion: We highlight the importance of considering copy number imbalances during normalization of array-CGH data, and show how failure to do so can deleteriously affect data and hamper interpretation.

    Non-coding antisense transcription detected by conventional and single-stranded cDNA microarray

    Background: Recent studies revealed that many mammalian protein-coding genes also transcribe their complementary strands. This phenomenon raises questions regarding the validity of data obtained from double-stranded cDNA microarrays, since hybridization to both strands may occur. Here, we wanted to analyze experimentally the incidence of antisense transcription in human cells and to estimate its influence on protein-coding expression patterns obtained by double-stranded microarrays. Therefore, we profiled transcription of sense and antisense independently by using strand-specific cDNA microarrays. Results: Up to 88% of expressed protein-coding loci displayed concurrent expression from the complementary strand. Antisense transcription is cell specific and showed a strong tendency to be positively correlated with the expression of the sense counterparts. Even though their expression is widespread, detected antisense signals seem to have a limited distorting effect on sense profiles obtained with double-stranded probes. Conclusion: Antisense transcription in humans can be far more common than previously estimated. However, it has limited influence on expression profiles obtained with conventional cDNA probes. This can be explained by a biological phenomenon and a bias of the technique: a) a co-ordinate sense and antisense expression variation and b) a bias for sense-hybridization to occur with more efficiency, presumably due to variable exonic overlap between antisense transcripts.

    Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios

    Background: Illumina Infinium whole genome genotyping (WGG) arrays are increasingly being applied in cancer genomics to study gene copy number alterations and allele-specific aberrations such as loss-of-heterozygosity (LOH). Methods developed for normalization of WGG arrays have mostly focused on diploid, normal samples. However, for cancer samples genomic aberrations may confound normalization and data interpretation. Therefore, we examined the effects of the conventionally used normalization method for Illumina Infinium arrays when applied to cancer samples. Results: We demonstrate an asymmetry in the detection of the two alleles for each SNP, which deleteriously influences both allelic proportions and copy number estimates. The asymmetry is caused by a remaining bias between the two dyes used in the Infinium II assay after using the normalization method in Illumina's proprietary software (BeadStudio). We propose a quantile normalization strategy for correction of this dye bias. We tested the normalization strategy using 535 individual hybridizations from 10 data sets from the analysis of cancer genomes and normal blood samples generated on Illumina Infinium II 300 k version 1 and 2, 370 k and 550 k BeadChips. We show that the proposed normalization strategy successfully removes asymmetry in estimates of both allelic proportions and copy numbers. Additionally, the normalization strategy reduces the technical variation for copy number estimates while retaining the response to copy number alterations. Conclusion: The proposed normalization strategy represents a valuable tool that improves the quality of data obtained from Illumina Infinium arrays, in particular when used for LOH and copy number variation studies.
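The abstract above names quantile normalization as the correction for dye bias but gives no procedural detail. As a rough illustration only, the following is a generic two-channel quantile normalization in plain Python; it is not the authors' implementation. The function name `quantile_normalize_two_channels` and the choice of the rank-wise mean as the shared target distribution are assumptions made for this sketch.

```python
def quantile_normalize_two_channels(x, y):
    """Map two equal-length intensity vectors onto one shared distribution.

    Generic textbook quantile normalization (an assumption, not the
    published method): the k-th smallest value in each channel is replaced
    by the mean of the two channels' k-th smallest values.
    """
    if len(x) != len(y):
        raise ValueError("both channels must have the same number of probes")
    n = len(x)

    # Target distribution: rank-wise mean of the two sorted channels.
    target = [(a + b) / 2 for a, b in zip(sorted(x), sorted(y))]

    def remap(values):
        # Probe indices ordered by intensity; ties keep their input order.
        order = sorted(range(n), key=lambda i: values[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = target[rank]
        return out

    return remap(x), remap(y)
```

After the call, both channels contain exactly the same set of values, so any intensity-dependent offset between them is removed while each probe keeps its rank within its own channel. Tied intensities are broken by input order here; a production implementation would likely average over ties.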

    The gene expression landscape of breast cancer is shaped by tumor protein p53 status and epithelial-mesenchymal transition

    Introduction: Gene expression data derived from clinical cancer specimens provide an opportunity to characterize cancer-specific transcriptional programs. Here, we present an analysis delineating a correlation-based gene expression landscape of breast cancer that identifies modules with strong associations to breast cancer-specific and general tumor biology. Methods: Modules of highly connected genes were extracted from a gene co-expression network that was constructed based on Pearson correlation, and module activities were then calculated using a pathway activity score. Functional annotations of modules were experimentally validated with an siRNA cell spot microarray system using the KPL-4 breast cancer cell line, and by using gene expression data from functional studies. Modules were derived using gene expression data representing 1,608 breast cancer samples and validated in data sets representing 971 independent breast cancer samples as well as 1,231 samples from other cancer forms. Results: The initial co-expression network analysis resulted in the characterization of eight tightly regulated gene modules. Cell cycle genes were divided into two transcriptional programs, and experimental validation using an siRNA screen showed different functional roles for these programs during proliferation. The division of the two programs was found to act as a marker for tumor protein p53 (TP53) gene status in luminal breast cancer, with the two programs being separated only in luminal tumors with functional p53 (encoded by TP53). Moreover, a module containing fibroblast and stroma-related genes was highly expressed in fibroblasts, but was also up-regulated by overexpression of epithelial-mesenchymal transition factors such as transforming growth factor beta 1 (TGF-beta1) and Snail in immortalized human mammary epithelial cells. 
Strikingly, the stroma transcriptional program was related to less malignant tumors in luminal disease and to aggressive lymph node-positive disease among basal-like tumors. Conclusions: We have derived a robust gene expression landscape of breast cancer that reflects known subtypes as well as heterogeneity within these subtypes. By applying the modules to TP53-mutated samples we shed light on the biological consequences of non-functional p53 in otherwise low-proliferating luminal breast cancer. Furthermore, as in the case of the stroma module, we show that the biological and clinical interpretation of a set of co-regulated genes is subtype-dependent.
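The Methods above describe extracting modules of highly connected genes from a co-expression network built on Pearson correlation. A minimal sketch of that general idea follows, in plain Python. The correlation cutoff (`threshold=0.8`), the use of connected components as "modules", and the function names are all assumptions made for illustration; the abstract does not specify how the authors defined connectivity or extracted modules.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def coexpression_modules(expr, threshold=0.8):
    """Group genes into modules.

    expr maps gene name -> list of expression values across samples.
    Two genes are linked when their Pearson correlation is >= threshold
    (hypothetical cutoff); a module is a connected component of that graph.
    """
    genes = list(expr)
    neighbors = {g: set() for g in genes}
    for g, h in combinations(genes, 2):
        if pearson(expr[g], expr[h]) >= threshold:
            neighbors[g].add(h)
            neighbors[h].add(g)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        # Depth-first walk collecting every gene reachable from g.
        stack, component = [g], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        modules.append(component)
    return modules
```

Connected components are the simplest possible module definition and tend to merge loosely related groups on real data; published pipelines typically use stricter criteria such as clique-like density or hierarchical clustering of the correlation matrix.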

    An integrated genomics analysis of epigenetic subtypes in human breast tumors links DNA methylation patterns to chromatin states in normal mammary cells.

    This article is open access. Aberrant DNA methylation is frequently observed in breast cancer. However, the relationship between methylation patterns and the heterogeneity of breast cancer has not been comprehensively characterized. Whole-genome DNA methylation analysis using Illumina Infinium HumanMethylation450 BeadChip arrays was performed on 188 human breast tumors. Unsupervised bootstrap consensus clustering was performed to identify DNA methylation epigenetic subgroups (epitypes). The Cancer Genome Atlas data, including methylation profiles of 669 human breast tumors, was used for validation. The identified epitypes were characterized by integration with publicly available genome-wide data, including gene expression levels, DNA copy numbers, whole-exome sequencing data, and chromatin states. We identified seven breast cancer epitypes. One epitype was distinctly associated with basal-like tumors and with BRCA1 mutations, one epitype contained a subset of ERBB2-amplified tumors characterized by multiple additional amplifications and the most complex genomes, and one epitype displayed a methylation profile similar to normal epithelial cells. Luminal tumors were stratified into the remaining four epitypes, with differences in promoter hypermethylation, global hypomethylation, proliferative rates, and genomic instability. Specific hyper- and hypomethylation across the basal-like epitype was rare. However, we observed that the candidate genomic instability drivers BRCA1 and HORMAD1 displayed aberrant methylation linked to gene expression levels in some basal-like tumors. Hypomethylation in luminal tumors was associated with DNA repeats and subtelomeric regions. We observed two dominant patterns of aberrant methylation in breast cancer.
One pattern, constitutively methylated in both basal-like and luminal breast cancer, was linked to genes with promoters in a Polycomb-repressed state in normal epithelial cells and displayed no correlation with gene expression levels. The second pattern correlated with gene expression levels and was associated with methylation in luminal tumors and genes with active promoters in normal epithelial cells. Our results suggest that hypermethylation patterns across basal-like breast cancer may have limited influence on tumor progression and instead reflect the repressed chromatin state of the tissue of origin. In contrast, hypermethylation patterns specific to luminal breast cancer influence gene expression, may contribute to tumor progression, and may present an actionable epigenetic alteration in a subset of luminal breast cancers. Swedish Cancer Society; Swedish Research Counci

    Tumor Genome Wide DNA Alterations Assessed by Array CGH in Patients with Poor and Excellent Survival Following Operation for Colorectal Cancer

    Genome-wide DNA alterations were evaluated by array CGH, in addition to RNA expression profiling, in colorectal cancer from patients with excellent and poor survival following primary operations.

    Relation between smoking history and gene expression profiles in lung adenocarcinomas

    Background: Lung cancer is the leading cause of cancer death worldwide. Tobacco use is the major pathogenic factor, but not all lung cancers are attributable to smoking. Specifically, lung cancer in never-smokers has been suggested to represent a distinct disease entity compared to lung cancer arising in smokers due to differences in etiology, natural history and response to specific treatment regimens. However, the genetic aberrations that differ between lung carcinomas of smokers and never-smokers remain largely unclear. Methods: Unsupervised gene expression analysis of 39 primary lung adenocarcinomas was performed using Illumina HT-12 microarrays. Results from unsupervised analysis were validated in six external adenocarcinoma data sets (n=687), and six data sets comprising normal airway epithelial or normal lung tissue specimens (n=467). Supervised gene expression analysis between smokers and never-smokers was performed in seven adenocarcinoma data sets, and the results were validated in the six normal data sets. Results: Initial unsupervised analysis of 39 adenocarcinomas identified two subgroups, one of which harbored all never-smokers. A generated gene expression signature could subsequently identify never-smokers with 79-100% sensitivity in external adenocarcinoma data sets and with 76-88% sensitivity in the normal materials. A notable fraction of current/former smokers were grouped with never-smokers. Intriguingly, supervised analysis of never-smokers versus smokers in seven adenocarcinoma data sets generated similar results. Overlap in classification between the two approaches was high, indicating that both approaches identify a common set of samples from current/former smokers as potential never-smokers.
The gene signature from unsupervised analysis included several genes implicated in lung tumorigenesis, immune-response associated pathways, genes previously associated with smoking, as well as marker genes for alveolar type II pneumocytes, while the best classifier from supervised analysis comprised genes strongly associated with proliferation, but also genes previously associated with smoking. Conclusions: Based on gene expression profiling, we demonstrate that never-smokers can be identified with high sensitivity in both tumor material and normal airway epithelial specimens. Our results indicate that tumors arising in never-smokers, together with a subset of tumors from smokers, represent a distinct entity of lung adenocarcinomas. Taken together, these analyses provide further insight into the transcriptional patterns occurring in lung adenocarcinoma stratified by smoking history.